Gene Ontology-based Semantic Similarity Measures

Author

  • Xiang Guo
Abstract

Quantitative measures of functional similarity between gene products are important for post-genomic studies. Such measures can be used to validate high-throughput protein interaction data, support the development of new pathway modelling tools and clustering methods, and identify functionally related gene products independently of homology [Guo et al., 2006; Schlicker et al., 2006]. Functional relationships are usually estimated from the annotations that gene products share in a controlled vocabulary system such as the Gene Ontology (GO). GO terms and their relationships are represented as directed acyclic graphs (DAGs), and the ontology comprises three categories: molecular function (MF), biological process (BP), and cellular component (CC). However, simply identifying shared GO annotations may not be adequate for estimating semantic similarity. Two annotations that differ can still be closely related through their common ancestors in the DAG. Conversely, shared terms may be too general to serve as evidence of a functional association between the annotated gene products. Taking the GO graph structure into account can therefore improve semantic similarity measures. For example, two methods (simUI and simLP) implemented in the GOstats package estimate the similarity between induced graphs, where each induced graph contains the GO annotations of one gene product together with all parents of those terms. More recently, a new set of measures has been used to calculate GO-derived semantic similarity. They rest on the assumption that the more information two terms share, the more similar they are. The shared information is indicated by the information content of the terms that subsume both of them in the DAG. The information content of a term is derived from the frequency with which that term, or any of its descendants, occurs in an annotated dataset: the rarer the term, the higher its information content.
Given the information content of each term, there are several ways to calculate similarity scores between annotated gene products.
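
The abstract does not name the individual measures, so the following is only a minimal sketch of how information content and two widely used scores could be computed. It assumes the common definition IC(t) = -log p(t), where p(t) is the fraction of annotations falling on t or any of its descendants, and it uses Resnik's and Lin's term-level measures as examples. The tiny ontology, the annotation counts, and all function names are hypothetical and are not taken from the paper or from any package.

```python
import math

# Hypothetical toy DAG: each term maps to the set of its direct parents.
PARENTS = {
    "root": set(),
    "binding": {"root"},
    "catalysis": {"root"},
    "dna_binding": {"binding"},
    "rna_binding": {"binding"},
}

# Hypothetical number of gene products annotated directly to each term.
DIRECT_COUNTS = {
    "root": 0, "binding": 2, "catalysis": 4, "dna_binding": 1, "rna_binding": 1,
}


def ancestors(term):
    """Return the term itself plus every term reachable via parent links."""
    seen = {term}
    stack = [term]
    while stack:
        for parent in PARENTS[stack.pop()]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def information_content():
    """IC(t) = log(total / count(t)), counting t and all of its descendants."""
    total = sum(DIRECT_COUNTS.values())
    cumulative = dict.fromkeys(PARENTS, 0)
    for term, n in DIRECT_COUNTS.items():
        for anc in ancestors(term):  # propagate each count up the DAG once
            cumulative[anc] += n
    return {t: math.log(total / c) for t, c in cumulative.items() if c > 0}


def resnik(a, b, ic):
    """Resnik: IC of the most informative term that subsumes both a and b."""
    common = ancestors(a) & ancestors(b)
    return max((ic[t] for t in common if t in ic), default=0.0)


def lin(a, b, ic):
    """Lin: shared IC normalised by the IC of the two terms themselves."""
    denom = ic.get(a, 0.0) + ic.get(b, 0.0)
    return 2.0 * resnik(a, b, ic) / denom if denom > 0 else 0.0


def gene_similarity(terms_a, terms_b, ic, term_sim=resnik):
    """One possible gene-level score: the best-matching pair of terms."""
    return max(term_sim(a, b, ic) for a in terms_a for b in terms_b)


if __name__ == "__main__":
    ic = information_content()
    print(round(resnik("dna_binding", "rna_binding", ic), 3))  # 0.693
    print(round(lin("dna_binding", "rna_binding", ic), 3))     # 0.333
    print(resnik("dna_binding", "catalysis", ic))              # 0.0
```

In this toy example the two binding terms share the ancestor "binding", so they score above zero, whereas "dna_binding" and "catalysis" share only the root, which carries no information. Taking the maximum over term pairs is just one way to move from terms to gene products; averaging over pairs is another common choice.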


Similar resources

CESSM : Collaborative Evaluation of Semantic Similarity Measures

The application of semantic similarity measures to proteins annotated with Gene Ontology terms has become a common method in bioinformatics. However, the evaluation of these measures is still challenging, since no common standard of evaluation exists. We present an online tool for the automated evaluation of GO-based semantic similarity measures, CESSM, that enables the comparison of new measur...

Correlating Information Contents of Gene Ontology Terms to Infer Semantic Similarity of Gene Products

Successful applications of the gene ontology to the inference of functional relationships between gene products in recent years have raised the need for computational methods to automatically calculate semantic similarity between gene products based on semantic similarity of gene ontology terms. Nevertheless, existing methods, though having been widely used in a variety of applications, may sig...

A-DaGO-Fun: an adaptable Gene Ontology semantic similarity-based functional analysis tool

SUMMARY Gene Ontology (GO) semantic similarity measures are being used for biological knowledge discovery based on GO annotations by integrating biological information contained in the GO structure into data analyses. To empower users to quickly compute, manipulate and explore these measures, we introduce A-DaGO-Fun (ADaptable Gene Ontology semantic similarity-based Functional analysis). It is ...

Improving Semantic Similarity for Proteins based on the Gene Ontology

One of the current challenges in the Life Sciences is to extract the knowledge contained in the vast amount of data that the genomic and post-genomic techniques are producing. One of the major efforts in this area was the development of the Gene Ontology (GO), a BioOntology that contains terms that describe gene products, organized in a graph structure. Gene products annotated with ontology ter...

Disjunctive shared information between ontology concepts: application to Gene Ontology

BACKGROUND The large-scale effort in developing, maintaining and making biomedical ontologies available motivates the application of similarity measures to compare ontology concepts or, by extension, the entities described therein. A common approach, known as semantic similarity, compares ontology concepts through the information content they share in the ontology. However, different disjunctiv...

Bi-directional semantic similarity for gene ontology to optimize biological and clinical analyses

BACKGROUND Semantic similarity analysis facilitates automated semantic explanations of biological and clinical data annotated by biomedical ontologies. Gene ontology (GO) has become one of the most important biomedical ontologies with a set of controlled vocabularies, providing rich semantic annotations for genes and molecular phenotypes for diseases. Current methods for measuring GO semantic s...

Publication date: 2007